Finding Semantically-Equivalent Binary Code By Synthesizing Adaptors

Authors

  • Vaibhav Sharma
  • Kesha Hietala
  • Stephen McCamant
Abstract

Independently developed codebases typically contain many segments of code that perform the same or closely related operations (semantic clones). Finding functionally equivalent segments enables applications like replacing a segment with a more efficient or more secure alternative. Such related segments often have different interfaces, so some glue code (an adapter) is needed to replace one with the other. We present an algorithm that searches for replaceable code segments at the function level by attempting to synthesize an adapter between them from some family of adapters; it terminates if it finds no possible adapter. We implement our technique using (1) concrete adapter enumeration based on Intel’s Pin framework and (2) binary symbolic execution, and explore the relation between the size of the adapter search space and total search time. We present examples of applying adapter synthesis for improving the security and efficiency of binary functions, deobfuscating binary functions, and switching between binary implementations of RC4. We present two large-scale evaluations: (1) we run adapter synthesis on more than 13,000 function pairs from the Linux C library, and (2) using more than 61,000 fragments of binary code extracted from an ARM image built for the iPod Nano 2g device and known functions from the VLC media player, we evaluate our adapter synthesis implementation on more than a million synthesis tasks. Our results confirm that several instances of adaptably equivalent binary functions exist in real-world code, and suggest that adapter synthesis can be applied for reverse engineering and for constructing cleaner, less buggy, more efficient programs.
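The search described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, which works on binary code with Pin and symbolic execution; here the functions are ordinary Python callables, the adapter family (each inner argument is either a target argument or a small constant) is an assumption chosen for brevity, and equivalence is checked only by brute force over a small bounded input domain. All names (`target`, `inner`, `synthesize`, and so on) are hypothetical.

```python
from itertools import product


def target(x, y):
    # Hypothetical "target" function whose interface we want to keep.
    return x - y


def inner(a, b, c):
    # Hypothetical "inner" function with a different interface.
    return (b - a) * c


def adapter_family(n_target_args, n_inner_args, constants=(0, 1)):
    # Assumed adapter family: every inner argument is bound either to one
    # of the target's arguments or to a small constant.
    choices = [("arg", i) for i in range(n_target_args)]
    choices += [("const", c) for c in constants]
    return product(choices, repeat=n_inner_args)


def apply_adapter(adapter, inner_fn, args):
    # Build the inner call's argument list from the target's arguments.
    inner_args = [args[v] if kind == "arg" else v for kind, v in adapter]
    return inner_fn(*inner_args)


def find_counterexample(target_fn, inner_fn, adapter, domain, n_target_args):
    # Bounded brute-force check; a stand-in for a symbolic equivalence query.
    for args in product(domain, repeat=n_target_args):
        if target_fn(*args) != apply_adapter(adapter, inner_fn, args):
            return args
    return None


def synthesize(target_fn, inner_fn, n_target_args, n_inner_args,
               domain=range(-4, 5)):
    tests = []  # counterexamples collected so far
    for adapter in adapter_family(n_target_args, n_inner_args):
        # Cheap filter: skip adapters that already fail a known test.
        if any(target_fn(*t) != apply_adapter(adapter, inner_fn, t)
               for t in tests):
            continue
        cex = find_counterexample(target_fn, inner_fn, adapter,
                                  domain, n_target_args)
        if cex is None:
            return adapter  # equivalent on every input in the domain
        tests.append(cex)
    return None  # family exhausted: no adapter exists in this family


if __name__ == "__main__":
    print(synthesize(target, inner, 2, 3))
```

Here the only adapter in the family that makes `inner` behave like `target` binds `a` to `y`, `b` to `x`, and `c` to the constant 1. Accumulating counterexamples lets most candidates be rejected by a handful of concrete tests before the more expensive full check runs, and the loop returns `None` once the family is exhausted.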

Similar resources

(State of) The Art of War: Offensive Techniques in Binary Analysis

Finding and exploiting vulnerabilities in binary code is a challenging task. The lack of high-level, semantically rich information about data structures and control constructs makes the analysis of program properties harder to scale. However, the importance of binary analysis is on the rise. In many situations binary analysis is the only possible way to prove (or disprove) properties about the ...

Memoized Semantics-Based Binary Diffing with Application to Malware Lineage Inference

Identifying differences between two executable binaries (binary diffing) has compelling security applications, such as software vulnerability exploration, “1-day” exploit generation and software plagiarism detection. Recently, binary diffing based on symbolic execution and constraint solving has been proposed to look for the code pairs with the same semantics, even though they are ostensibly dif...

Capacity-Approaching Joint Source-Channel Coding for Asymmetric Channels with Low-Density Parity-Check Codes

By only sending the parity bits, joint source-channel coding can be natively achieved with low-density parity-check codes. However, the code ensemble design of optimal low-density parity-check codes for joint source-channel coding over asymmetric communication channels is difficult. To circumvent this difficulty, source-channel adaptors are proposed in this paper. By using the source-channel ad...

Conditional Equivalence

A typical software module evolves through many versions over the course of its development. To maintain compatibility with module clients, it is crucial that a module’s behavior at its interface does not change in an undesirable manner across versions. The problem of introducing changes which break interface behavior remains one of the most daunting challenges in the maintenance of large softwa...

Code Obfuscation using Code Splitting with Self-modifying Code

Code obfuscation is a protection technique that transforms software into a semantically equivalent form that is strenuous to reverse engineer. As a part of software protection and security, code obfuscation has attracted commercial interest from both the vendors’ side, to keep their proprietary software secret, and the customers’ side, to have trusted software that doesn’t leak or destroy their personal information. ...

Journal:
  • CoRR

Volume abs/1707.01536  Issue 

Pages  -

Publication date 2016